Retrospective studies that analyze mortality outcomes frequently rely on the National Death Index (NDI) to ascertain vital status, as well as date and cause of death [1,2,3]. The NDI is managed by the US National Center for Health Statistics, and since 1979 it has been used in more than 1000 health research publications [4]. Cause-of-death information in this database is drawn from death certificates, which frequently contain classification errors and are almost always completed by individuals who did not have a formal clinical relationship with the decedent. Yet investigations that evaluate prostate cancer-specific mortality (PCSM) rates commonly reference data sources that rely upon death certificates [5,6,7,8,9], even though a review by medical examiners found that up to 60% of death certificates contain errors in the underlying cause-of-death code [10]. This high error rate has been attributed to the characteristics of death certifiers [11], and it has raised concerns about the accuracy of cancer-specific mortality statistics, which have been found to be erroneous in the literature [12, 13]. Herein, we investigate the impact of this known misclassification of cause of death on studies evaluating PCSM following prostatectomy.

The Shared Equal Access Regional Cancer Hospital (SEARCH) database is a retrospective, IRB-approved database that maintains records on 5009 Veterans who underwent a radical prostatectomy (RP) at eight Veterans Affairs Medical Centers between 1982 and 2016. It is routinely updated through manual review of individual patient electronic medical records available through the Veterans Health Administration (VHA), which include information on labs, pathology, radiology, and clinical notes, and it is coded by personnel who are trained by the SEARCH team. Data are abstracted according to detailed guidelines that were developed by urologic oncologists, and 10% of cases are routinely audited to ensure the validity and consistency of data collection. Cause of death in the SEARCH database is coded as “prostate cancer” whenever there was progressive metastatic disease following hormonal therapy without another obvious cause of death, “other” whenever a death related to prostate cancer could be ruled out, and “unknown or other” whenever information was incomplete. When in doubt, cases are reviewed by a urologic oncologist who specializes in prostate cancer (SJF, WJA).

We cross-referenced data from the SEARCH database coded between 1989 and 2011 with the NDI for vital status, cause of death, and date of death. Social security numbers were used to reconcile this information, and the analysis relied on ICD-9 codes to match the cause of death. A total of 1327 and 1093 patients were coded as deceased in SEARCH and NDI, respectively. All patients coded as deceased in the NDI were also coded as deceased in SEARCH, though 17% (219) of the patients coded as deceased in SEARCH were listed as alive in the NDI (see Table 1). This rate of incongruence for vital status decreased to 4% after 1998 (see Fig. 1). For patients with concordant death status, the dates of death matched exactly in 94% of cases, within one day in 97%, within one week in 99%, and within 31 days in 100%.
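As an illustration of how the date-concordance tiers above could be tallied, the following minimal Python sketch computes the cumulative share of paired death dates that agree within each window. It is not the code used in this analysis: the function name `date_agreement`, the `WINDOWS` mapping, and the example dates are all hypothetical and invented for illustration only.

```python
from datetime import date

# Tolerance windows in days, mirroring the tiers reported in the text.
WINDOWS = {
    "exact": 0,
    "within_1_day": 1,
    "within_1_week": 7,
    "within_31_days": 31,
}


def date_agreement(pairs):
    """Return the cumulative fraction of date pairs agreeing within each window.

    `pairs` is a list of (registry_date, index_date) tuples for patients
    whose death status is concordant in both sources.
    """
    if not pairs:
        raise ValueError("no date pairs supplied")
    # Absolute difference in days between the two recorded dates of death.
    gaps = [abs((a - b).days) for a, b in pairs]
    return {
        name: sum(gap <= limit for gap in gaps) / len(gaps)
        for name, limit in WINDOWS.items()
    }


# Hypothetical date pairs (NOT study data).
example_pairs = [
    (date(2001, 3, 14), date(2001, 3, 14)),   # identical dates
    (date(2004, 7, 2), date(2004, 7, 3)),     # one day apart
    (date(2008, 11, 20), date(2008, 11, 25)), # five days apart
    (date(2010, 1, 5), date(2010, 1, 30)),    # twenty-five days apart
]
agreement = date_agreement(example_pairs)
```

Because each window includes all narrower ones, the fractions are cumulative, which is how the 94%, 97%, 99%, and 100% figures above should be read.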

Table 1 Cross tabulation of death status in SEARCH vs. NDI
Fig. 1 Histogram showing the number of death statuses that agree between SEARCH and NDI and deaths missed by NDI between 1988 and 2011

NDI and SEARCH were in concordance on 941 deaths from other causes and 92 prostate cancer (PC) deaths (see Table 2). However, 13 patients were misclassified by SEARCH as having died of prostate cancer and 47 were misclassified by NDI as dying from non-PC causes. Of the 104 patients coded in SEARCH as dead from PC, the NDI coded 77% accurately, with 11% as non-PC, and 12% as alive. Of the 139 patients coded in the NDI as dead from PC, SEARCH confirmed 66% to be accurate, with 34% as non-PC, and 0% as alive. After dropping patients with an unknown cause of death in either source, the positive predictive value for PC death in NDI was 72% and the negative predictive value was 98%.
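For reference, positive and negative predictive values are derived from a two-by-two cross tabulation once a reference standard is chosen. The minimal Python sketch below assumes that the chart-review source (here, SEARCH) is treated as the reference and the death-certificate source (here, the NDI) as the test. The function name `predictive_values` and the counts are hypothetical; they are not the study counts and do not reproduce the values reported above.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from the four cells of a 2x2 cross tabulation.

    tp: test says PC death, reference says PC death
    fp: test says PC death, reference says non-PC death
    fn: test says non-PC death, reference says PC death
    tn: test says non-PC death, reference says non-PC death
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("each test category needs at least one patient")
    ppv = tp / (tp + fp)  # share of test-positive deaths confirmed by the reference
    npv = tn / (tn + fn)  # share of test-negative deaths confirmed by the reference
    return ppv, npv


# Hypothetical counts for illustration only (NOT study data).
ppv, npv = predictive_values(tp=80, fp=20, fn=10, tn=90)
```

Patients with an unknown cause of death in either source would need to be excluded before filling the four cells, as was done for the predictive values reported above.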

Table 2 Cross tabulation of cause of death in SEARCH vs. NDI

Previous evaluations have shown variable rates of agreement between the cause of death coded in death certificates and that recorded in available hospital records. Uncertainties during completion of death certificates and transcription errors contribute to challenges in coding cause of death. Yet challenges are present even in prospective clinical trials when complete clinical records are available. For example, the randomized PIVOT trial utilized an Endpoints Committee comprising an internist and two urologists to assign the cause of death. Each committee member was presented with “death summary packets” that included a death certificate; a brief report of the site investigator’s conclusions regarding cause of death; study PSA values; study bone scan results; and relevant medical records, including discharge summaries, radiology reports, and pathology reports, particularly when another type of cancer was suspected. Even with this amount of detailed clinical information, the committee members agreed in only 56% of cases before discussing whether the death was “definitely” or “probably” due to prostate cancer or treatment. Agreement increased to 86% when the “definitely” and “probably” assignments were collapsed (a strategy that was defined a priori), which leaves concerns about the reliability of the PCSM endpoint even in this phase III trial. As this experience indicates, ascertaining the cause of death, even in a prospective clinical trial, can be challenging and is an inexact science. As such, there is the possibility of misclassification in SEARCH as well, given that clinical records may be missing for decedents who received their care outside the VA at the end of life. There may be scenarios where the individuals completing death certificates had access to more meaningful clinical records just before death, though our results included an analysis that excluded patients in SEARCH who had an “unknown” cause of death.

Using cause-specific survival, as opposed to overall survival, to evaluate treatment efficacy has the advantage of minimizing biases introduced by competing causes of death. However, deaths from prostate cancer are frequently misclassified, especially in databases such as the NDI, which rely on death certificates, and thus studies using a PCSM endpoint should be interpreted with caution.